Storm surge, the onshore rush of sea water caused by the high
winds and low pressure associated with a hurricane, can compound
the effects of inland flooding caused by rainfall, leading to loss of
property and loss of life for residents of coastal areas. Numerical ocean
models are essential for creating storm surge forecasts for coastal ar- 
eas. These models are driven primarily by the surface wind forcings. 
Currently, the gridded wind fields used by ocean models are specified 
by deterministic formulas that are based on the central pressure and 
location of the storm center. While these equations incorporate im- 
portant physical knowledge about the structure of hurricane surface 
wind fields, they cannot always capture the asymmetric and dynamic 
nature of a hurricane. A new Bayesian multivariate spatial statisti- 
cal modeling framework is introduced combining data with physical 
knowledge about the wind fields to improve the estimation of the 
wind vectors. Many spatial models assume the data follow a Gaussian 
distribution. However, this may be overly restrictive for wind fields
data, which often display erratic behavior, such as sudden changes in
time or space. In this paper we develop a semiparametric multivariate 
spatial model for these data. Our model builds on the stick-breaking 
prior, which is frequently used in Bayesian modeling to capture un- 
certainty in the parametric form of an outcome. The stick-breaking 
prior is extended to the spatial setting by assigning each location a 
different, unknown distribution, and smoothing the distributions in 
space with a series of kernel functions. This semiparametric spatial 
model is shown to improve prediction compared to usual Bayesian 
Kriging methods for the wind field of Hurricane Ivan. 
1. Introduction. Modeling surface wind fields is essential for hurricane
forecasting. A wind field gives the wind velocity at any location in the vicin- 
ity of the hurricane. The numerical ocean models used to predict the storm 
surge for coastal areas rely heavily on wind field inputs. Currently, deter- 
ministic formulas such as the Holland model [Holland (1980)] are used to 
generate the wind fields for the storm surge model based on a few meteoro- 
logical inputs such as the radius and central pressure of the storm. 

While the Holland model captures many of the important features of a 
wind field, Foley and Fuentes (2006) show that this model does not allow for 
asymmetries often seen in wind fields and that storm surge prediction can be 
improved by supplementing the Holland model with a Gaussian geostatis- 
tical model. Another approach would be to introduce a more sophisticated 
deterministic wind model. A coupled atmospheric-oceanic numerical model 
can be used to simulate the surface winds at the boundary layer of the 
ocean model. However, the CPU time required to produce these modeled 
winds at high enough resolution for coastal prediction (1 to 4 km grids) 
prevents such model runs from being used in real-time applications. Alter- 
natively, one could write a stochastic version of the deterministic model and 
approximate the physical model using a stochastic spatial basis. This is the 
approach of Wikle et al. (2001) for oceanic surface winds. 

This paper proposes a semiparametric multivariate spatial model to
predict a wind field. The predictions in this paper are purely spatial
predictions made using multiple sources of observed data and Holland model
output from a single time point. Several Gaussian multivariate spatial
covariance models have been proposed. For example, Brown, Le and Zidek (1994)
model the joint covariance of the observed multivariate data using an inverse
Wishart distribution centered on a separable covariance matrix. Another
approach is to represent the multivariate spatial process as a linear
combination of univariate spatial processes. Variations of this linear model of
coregionalization (LMC) have been used by Grzebyk and Wackernagel (1994),
Wackernagel (2003), Schmidt and Gelfand (2003), Banerjee, Carlin and Gelfand
(2004) and Gelfand et al. (2004). Foley and Fuentes (2006) apply the LMC
to the two orthogonal west/east and north/south components of hurricane
wind vectors.

Spatial models often assume the outcomes follow normal distributions. 
The Gaussian assumption is difficult to verify empirically and may be overly- 
restrictive for hurricane wind field data, which can display erratic behavior, 
such as sudden changes in time or space. For example, on the periphery of the 
map in Figure 1(a) the wind vectors vary smoothly from one measurement 
to the next. However, near the eye of the hurricane (center of the plot), 
the wind vectors are extremely volatile. Traditional Gaussian spatial models 
tend to oversmooth the area near the eye of the hurricane, resulting in a poor 
fit. Therefore, in this paper we develop a new multivariate semiparametric 
spatial model for these data that avoids specifying a Gaussian distribution
for the spatial random effects. 

Our semiparametric model avoids assuming normality by extending the
stick-breaking prior of Sethuraman (1994) to the multivariate spatial
setting. For general (nonspatial) Bayesian modeling, the stick-breaking prior
offers a way to model the distribution of a parameter as an unknown quantity
to be estimated from the data. The stick-breaking prior for the unknown
distribution $F$ is the mixture

(1)    $F = \sum_{i=1}^{m} p_i \, \delta(\theta_i),$

where the number of mixture components $m$ may be infinite, $p_i$ are the
mixture probabilities, and $\delta(\theta_i)$ is the Dirac distribution with
point mass at $\theta_i$. The mixture probabilities "break the stick" into $m$
pieces so the sum of the pieces is one, that is, $\sum_{i=1}^{m} p_i = 1$. The
first mixture probability is modeled as $p_1 = V_1$, where
$V_1 \sim \mathrm{Beta}(a, b)$. Subsequent mixture probabilities are
$p_i = (1 - \sum_{j=1}^{i-1} p_j) V_i$, where $1 - \sum_{j=1}^{i-1} p_j$ is the
probability not accounted for by the first $i - 1$ mixture components, and
$V_i \stackrel{iid}{\sim} \mathrm{Beta}(a, b)$ is the proportion
of the remaining probability assigned to the $i$th component. The locations
$\theta_i \stackrel{iid}{\sim} F_0$, where $F_0$ is a known prior distribution.
A special case of this prior
is the Dirichlet process prior with $m = \infty$ and $a = 1$ [Ferguson (1973, 1974)].
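
The construction in (1) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function name `stick_breaking_weights`, the values of `m`, `a` and `b`, and the choice $F_0 = N(0, 1)$ are assumptions made for the example. The last proportion is set to one so that a finite mixture sums to one.

```python
import numpy as np

def stick_breaking_weights(v):
    """Map stick proportions V_1..V_m to mixture probabilities p_1..p_m.

    p_1 = V_1 and p_i = V_i * prod_{j<i} (1 - V_j), which equals
    (1 - sum_{j<i} p_j) * V_i as in the text.
    """
    v = np.asarray(v, dtype=float)
    # Probability left over before component i: prod_{j<i} (1 - V_j).
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - v[:-1])))
    return v * remaining

rng = np.random.default_rng(0)       # seed chosen only for reproducibility
m, a, b = 20, 1.0, 2.0               # illustrative values, not from the paper
v = rng.beta(a, b, size=m)           # V_i ~ Beta(a, b)
v[-1] = 1.0                          # finite m: last piece takes the remainder
p = stick_breaking_weights(v)
theta = rng.normal(size=m)           # locations from an assumed F_0 = N(0, 1)
draws = rng.choice(theta, size=5, p=p)   # five draws from the discrete F
```

Because $F$ is discrete, repeated draws from it can coincide exactly, which is what lets the prior group observations into clusters.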

The stick-breaking prior in (1) has been extended to the univariate spatial
setting by incorporating spatial information into either the model for the
locations $\theta_i$ or the model for the masses $p_i$. Gelfand, Kottas and
MacEachern (2005a) and Gelfand, Guindani and Petrone (2007) model the locations
as vectors drawn from a spatial distribution. This approach is generalized by
Duan, Guindani and Gelfand (2007) to allow both the weights and locations
to vary spatially. However, these approaches require replication, and thus are
not appropriate for analyzing the wind fields data. Griffin and Steel (2006)
propose a spatial Dirichlet model that does not require replication. Their
model permutes the $V_i$ based on spatial location, allowing the prior to be
different in different regions of the spatial domain.

This paper is the first to extend the stick-breaking prior to the
multivariate spatial setting. Our semiparametric multivariate spatial model for
a hurricane wind field has bivariate normal priors for the locations
$\theta_i$. Similar to Griffin and Steel, the probabilities $p_i$ vary
spatially. However, rather than a random permutation of the $V_i$, we introduce
a series of kernel functions to allow the masses to change with space. This
results in a flexible spatial model, as different kernel functions lead to
different relationships between the distributions at nearby locations. This
model is similar to that of Dunson and Park (2007), who use kernels to smooth
the weights in the non-spatial setting. Our model is also computationally
convenient because it avoids reversible jump MCMC steps and inverting large
matrices, which is crucial for analysis of hurricane wind fields since
estimates must be made in real time.

The paper proceeds as follows. Section 2 describes the various sources 
of data used to map the wind field. The semiparametric spatial prior for 
univariate spatial data is introduced in Section 3. This model is extended 
to a multivariate model to analyze wind field data in Section 4. The model 
incorporates both a deterministic wind model and multiple sources of wind 
observations and allows for potential bias for each data source. This model 
is used to map the wind field of Hurricane Ivan in Section 5. Section 6 
concludes. 



2. Description of the wind fields data. We model wind fields data
from Hurricane Ivan as it passed through the Gulf of Mexico at 12 pm
on September 15, 2004. The three sources of information used in
this analysis are plotted in Figure 1. The first source is gridded
satellite data [Figure 1(a)] available from NASA's SeaWinds database
(http://podaac.jpl.nasa.gov/products/productl09.html). These data are
available twice daily on a 0.25 x 0.25 degree global grid. Due to the satellite
data's potential bias, measurement error and coarse temporal resolution, we
supplement our wind fields analysis with data from NOAA's National Data
Buoy Center. Buoy data are collected every ten minutes at a relatively small
number of marine locations [Figure 1(b)]. These measurements are adjusted
to a common height of 10 meters above sea level using the algorithm of
Large and Pond (1981).

In addition to satellite and buoy data, our model incorporates the
deterministic Holland model [Holland (1980)]. NOAA currently uses this
model alone to produce wind fields for their numerical ocean models. The
Holland model predicts that the wind speed at location $s$ is

(2)    $H(s) = \left[ \frac{B}{\rho} \left( \frac{R_{max}}{r} \right)^{B} (P_n - P_c) \exp\left\{ -\left( \frac{R_{max}}{r} \right)^{B} \right\} \right]^{1/2},$

where $r$ is the radius (km) from the storm center to site $s$, $P_n$ is the
ambient pressure (mb), $P_c$ is the hurricane central pressure (mb), $\rho$ is
the air density (kg m$^{-3}$), $R_{max}$ is the radius of the maximum wind (km),
and $B$ controls the shape of the pressure profile.

Section 4's multivariate spatial model decomposes the wind vectors into
their orthogonal west/east (u) and north/south (v) components. The Holland
model for the u and v components is

(3)    $H_u(s) = H(s) \sin(\phi)$ and $H_v(s) = H(s) \cos(\phi),$

where $\phi$ is the inflow angle at site $s$, across circular isobars toward the
storm center, rotated to adjust for the storm's direction. We fix the
parameters $P_n = 1010$ mb, $P_c = 939$ mb, $\rho = 1.2$ kg m$^{-3}$,
$R_{max} = 49$ km and $B = 1.9$ using the meteorological data from the National
Hurricane Center (http://www.nhc.noaa.gov) and recommendations of Hsu and Yan
(1998). The output from this model for Hurricane Ivan is plotted in Figure 1(c).
By construction, the Holland model output is symmetric with respect to
the storm's center, which does not agree with the satellite observations in
Figure 1(a).

Fig. 1. Plot of various types of wind field data/output for Hurricane Ivan on September
15, 2004.
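
As a rough numerical check on (2) and (3), the Holland profile can be evaluated directly. The sketch below uses the parameter values fixed above. The conversion of pressures from mb to Pa (so that the speed comes out in m/s), the function names, the radii and the inflow angle are assumptions of this example, not part of the paper.

```python
import numpy as np

def holland_speed(r_km, p_n_mb=1010.0, p_c_mb=939.0, rho=1.2,
                  r_max_km=49.0, b=1.9):
    """Holland (1980) wind speed at radius r_km from the storm center.

    Pressures are given in mb and converted to Pa (1 mb = 100 Pa) so that,
    with rho in kg m^-3, the result is in m/s. The unit conversion is an
    assumption of this sketch.
    """
    r = np.asarray(r_km, dtype=float)
    ratio = (r_max_km / r) ** b            # (R_max / r)^B, dimensionless
    dp_pa = 100.0 * (p_n_mb - p_c_mb)      # P_n - P_c in Pa
    return np.sqrt(b / rho * ratio * dp_pa * np.exp(-ratio))

def holland_components(speed, phi):
    """Split the speed into u and v components as in (3); phi in radians."""
    return speed * np.sin(phi), speed * np.cos(phi)

radii = np.array([10.0, 49.0, 100.0, 300.0])      # km, illustrative
speeds = holland_speed(radii)
u, v = holland_components(speeds, phi=np.pi / 6)  # phi value is arbitrary
```

Under these assumptions the profile peaks at $r = R_{max}$ with a speed of roughly 64 m/s and depends on the site only through $r$, which is the symmetry noted above.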

3. The spatial stick-breaking (SSB) prior. In this section we develop a
univariate semiparametric spatial model for data from a single source. The
spatial stick-breaking prior developed here is incorporated into our model
for the wind fields data in Section 4. Let $y(s)$, the observed value at site
$s = (s_1, s_2)$, have the model

(4)    $y(s) = \mu(s) + x(s)'\beta + \varepsilon(s),$

where $\mu(s)$ is a spatial random effect, $x(s)$ is a vector of covariates for
site $s$, $\beta$ are the regression parameters and
$\varepsilon(s) \stackrel{iid}{\sim} N(0, \sigma^2)$.

The spatial effects are each assigned a different prior distribution, that is,
$\mu(s) \sim F(s)$. The distributions $F(s)$ are unknown and smoothed spatially.
Extending (1) to depend on $s$, the prior for $F(s)$ is the potentially infinite
mixture

(5)    $F(s) = \sum_{i=1}^{m} p_i(s) \, \delta(\theta_i),$

where $p_1(s) = V_1(s)$, $p_i(s) = V_i(s) \prod_{j=1}^{i-1} (1 - V_j(s))$ for
$i > 1$, and $V_i(s) = w_i(s) V_i$. The distributions $F(s)$ are related through
their dependence on the $V_i$ and $\theta_i$, which are given the priors
$V_i \sim \mathrm{Beta}(a, b)$ and $\theta_i \sim N(0, \tau^2)$,
each independent across $i$. However, the distributions vary spatially
according to the kernel functions $w_i(s)$, which are restricted to the
interval [0, 1]. The function $w_i(s)$ is centered at knot
$\psi_i = (\psi_{1i}, \psi_{2i})$ and the spread is controlled by the bandwidth
parameter $\epsilon_i = (\epsilon_{1i}, \epsilon_{2i})$. Both the
knots and the bandwidths are modeled as unknown parameters with
priors that are independent of the $V_i$ and $\theta_i$. The knots $\psi_i$ are
given independent uniform priors over the bounded spatial domain (this is
generalized in Section 4). The bandwidths can be modeled as equal for each
kernel function or varying across kernel functions following prior
distributions.

Although there are many possible kernel functions, Table 1 gives two
examples. Uniform kernels offer bounded support. This is an attractive feature
when modeling hurricane wind fields because wind behavior may be different
in different subregions, for example, in the center of the storm versus the
periphery. We compare uniform kernels with squared-exponential kernels.
Squared-exponential kernels decay slowly in space, which may be desirable
in other applications.

To ensure that the stick-breaking prior is proper, we must choose priors
for $\epsilon_i$ and $V_i$ so that $\sum_{i=1}^{m} p_i(s) = 1$ almost surely for
all $s$. Appendix A.1 shows that the SSB prior with infinite $m$ is proper if
$E(V_i) = a/(a + b)$ and $E[w_i(s)]$ [where the expectation is taken over
$(\psi_i, \epsilon_i)$] are both positive. For finite $m$, we can ensure that
$\sum_{i=1}^{m} p_i(s) = 1$ for all $s$ by setting $V_m(s) = 1$ for
all $s$. This is equivalent to truncating the infinite mixture by attributing all
of the mass from the terms with $i \ge m$ to $p_m(s)$.

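Table 1
Examples of kernel functions and the induced functions $\gamma(s, s')$, where
$h_1 = |s_1 - s_1'| + |s_2 - s_2'|$, $h_2 = \sqrt{(s_1 - s_1')^2 + (s_2 - s_2')^2}$,
$I(\cdot)$ is the indicator function, and $x^+ = \max(x, 0)$

Name | $w_i(s)$ | Model for $\epsilon_{1i}$ and $\epsilon_{2i}$ | $\gamma(s, s')$
Uniform | $\prod_{j=1}^{2} I(|s_j - \psi_{ji}| < \epsilon_{ji}/2)$ | $\epsilon_{1i}, \epsilon_{2i} \equiv \lambda$ | $\prod_{j=1}^{2} (1 - |s_j - s_j'|/\lambda)^+$
Uniform | $\prod_{j=1}^{2} I(|s_j - \psi_{ji}| < \epsilon_{ji}/2)$ | $\epsilon_{1i}, \epsilon_{2i} \sim \mathrm{Expo}(\lambda)$ | $\exp(-h_1/\lambda)$
Squared exp. | $\prod_{j=1}^{2} \exp(-(s_j - \psi_{ji})^2/\epsilon_{ji}^2)$ | $\epsilon_{1i}^2, \epsilon_{2i}^2 \equiv \lambda^2/2$ | $0.5 \exp(-h_2^2/\lambda^2)$
Squared exp. | $\prod_{j=1}^{2} \exp(-(s_j - \psi_{ji})^2/\epsilon_{ji}^2)$ | $\epsilon_{1i}^2, \epsilon_{2i}^2 \sim \mathrm{IG}(1.5, \lambda^2/2)$ | $0.5 / \prod_{j=1}^{2} (1 + (s_j - s_j')^2/\lambda^2)$

In practice, allowing $m$ to be infinite is often unnecessary and
computationally infeasible. Choosing the number of components in a mixture model
is notoriously problematic. Fortunately, in this setting the truncation
error can easily be assessed by inspecting the distribution of $p_m(s)$, the mass
of the final component of the mixture. The number of components $m$ can
be chosen by generating samples from the prior distribution of $p_m(s)$. We
increase $m$ until $p_m(s)$ is satisfactorily small for each site $s$. Also, the
truncation error is monitored by inspecting the posterior distribution of
$p_m(s)$, which is readily available from the MCMC samples.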
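
The prior check on the truncation level can be sketched as follows for a one-dimensional domain with uniform kernels. All numerical settings (`a`, `b`, `lam`, the site `s = 0.5`, the number of simulations) and the function name `remaining_mass` are illustrative assumptions. In the model itself, `a`, `b` and the range parameter have priors and would be drawn rather than fixed.

```python
import numpy as np

def remaining_mass(s, m_max, a, b, lam, rng):
    """One prior draw of the mass left unassigned before each component.

    Returns an array whose entry m-1 is prod_{k<m} [1 - w_k(s) V_k], i.e.
    the mass p_m(s) given to the final component when the mixture is
    truncated at m components (V_m(s) = 1).
    """
    knots = rng.uniform(0.0, 1.0, size=m_max)        # psi_k ~ Uniform(0, 1)
    v = rng.beta(a, b, size=m_max)                   # V_k ~ Beta(a, b)
    w = (np.abs(s - knots) < lam / 2).astype(float)  # uniform kernel, width lam
    return np.concatenate(([1.0], np.cumprod(1.0 - w * v)[:-1]))

rng = np.random.default_rng(1)
a, b, lam = 1.0, 1.0, 0.3          # illustrative hyperparameter values
sims = np.array([remaining_mass(0.5, 80, a, b, lam, rng) for _ in range(2000)])
for m in (5, 20, 80):
    # 95th percentile of the prior distribution of p_m(s) at s = 0.5
    print(m, np.quantile(sims[:, m - 1], 0.95))
```

Each simulated path is non-increasing in $m$, so one would raise $m$ until the chosen upper quantile falls below a tolerance at every site of interest.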

Assuming finite $m$, the spatial stick-breaking model can be written as a
mixture model where $g(s) \in \{1, \ldots, m\}$ indicates site $s$'s group, that is,

(6)    $y(s) = \theta_{g(s)} + x(s)'\beta + \varepsilon(s)$, where $\varepsilon(s) \stackrel{iid}{\sim} N(0, \sigma^2)$,
       $\theta_j \stackrel{iid}{\sim} N(0, \tau^2)$, $j = 1, \ldots, m$,
       $g(s) \sim \mathrm{Categorical}(p_1(s), \ldots, p_m(s))$,
       $p_j(s) = w_j(s) V_j \prod_{k<j} [1 - w_k(s) V_k]$,

where $\mu(s) = \theta_{g(s)}$, $V_j \sim \mathrm{Beta}(a, b)$, and
$\prod_{k<j} [1 - w_k(s) V_k] = 1$ for $j = 1$. To
complete the Bayesian model, we specify priors for the hyperparameters. The
regression parameters $\beta$ can be given vague normal priors. In the analysis
of Hurricane Ivan in Section 5, the mean term $x(s)'\beta$ is replaced by the
Holland model output. The parameters that control the beta prior for the
$V_i$, $a$ and $b$, have independent Uniform(0, 10) priors, and the variances
$\sigma^2$ and $\tau^2$ have InvGamma(0.01, 0.01) priors. We also tried
InvGamma(0.5, 0.005) priors for the variances and found that the prior had
little effect. The knots $\psi_i$ that control the center of the kernel
functions are given uniform priors over the spatial domain, and examples of
priors for the bandwidth parameters $\epsilon_i$ are given in Table 1. The prior
for the bandwidths depends on a range parameter, $\lambda$, which is given a
Uniform(0, $\lambda_{max}$) prior. We take $\lambda_{max}$ to
be the maximum distance between any pair of points in the spatial grid.
This model can be implemented using WinBUGS. WinBUGS can be freely
downloaded from http://www.mrc-bsu.cam.ac.uk/bugs/.

Fig. 2. Example to illustrate the spatial stick-breaking prior. In this example, the
spatial domain is the one-dimensional interval (0, 1) and the model has Gaussian
kernels with knots $\psi = (0.5, 0.0, 1.0, 0.2, 0.8)$, bandwidths
$\epsilon = (0.1, 0.2, 0.2, 0.2, 0.2)$ and $V = (0.9, 0.7, 0.7, 0.9, 0.9)$. Panel
(a) shows the masses $p_i(s)$ and panel (b) shows the correlation between
$\mu(s)$ and $\mu(s')$.

The mixture model in (6) is nowhere continuous unless uniform kernels
are selected and $V_i \in \{0, 1\}$ for all $i$. An alternative suggested by a
referee is

(7)    $g(s) = j$ where $p_j(s) = \max\{p_1(s), \ldots, p_m(s)\}.$

This would result in a piece-wise constant random tessellation model which
may be preferred for smooth spatial data. However, to avoid oversmoothing
micro-scale phenomena in hurricanes, we use the everywhere discontinuous
model in (6).

Figure 2(a) illustrates the spatially varying weights of the stick-breaking
prior for a one-dimensional example with $m = 6$ and squared exponential
kernel functions. We arbitrarily select knots
$\psi = (0.5, 0.0, 1.0, 0.2, 0.8)$, bandwidths
$\epsilon = (0.1, 0.2, 0.2, 0.2, 0.2)$ and $V = (0.9, 0.7, 0.7, 0.9, 0.9)$. The
first kernel function is centered at $s = 0.5$. Since $V_1 = 0.9$, the mass for
the first component for $s = 0.5$ is $p_1(0.5) = 0.9$ and decreases as $s$ moves
away from 0.5. The second and third kernel functions are centered at $s = 0.0$
and $s = 1.0$, respectively, and dominate the probabilities near the edges. For
this example, $p_m(s)$ is as large as 0.2, suggesting $m$ should be increased to
give an acceptable approximation to the infinite spatial stick-breaking prior.
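
The masses in this one-dimensional example can be recomputed with a short script. This is a sketch under stated assumptions: the kernel is taken to be $w_i(s) = \exp(-(s - \psi_i)^2/\epsilon_i^2)$ (the exact bandwidth parameterization behind the figure is assumed), the sixth component is given $V_6(s) = 1$ so the masses sum to one, and the function name `ssb_weights` and the evaluation grid are hypothetical.

```python
import numpy as np

def ssb_weights(s, knots, bandwidths, v):
    """Spatially varying masses p_i(s) on a 1-D grid, with m = len(v) + 1.

    Uses w_i(s) = exp(-(s - psi_i)^2 / eps_i^2); the exact bandwidth
    parameterization is an assumption of this sketch.
    """
    s = np.asarray(s, dtype=float)[:, None]                 # shape (n, 1)
    w = np.exp(-((s - knots) ** 2) / bandwidths ** 2)       # shape (n, m - 1)
    vs = np.hstack([w * v, np.ones((s.shape[0], 1))])       # V_m(s) = 1
    remaining = np.hstack([np.ones((s.shape[0], 1)),
                           np.cumprod(1.0 - vs[:, :-1], axis=1)])
    return vs * remaining                                   # shape (n, m)

knots = np.array([0.5, 0.0, 1.0, 0.2, 0.8])
bandwidths = np.array([0.1, 0.2, 0.2, 0.2, 0.2])
v = np.array([0.9, 0.7, 0.7, 0.9, 0.9])
grid = np.linspace(0.0, 1.0, 101)
p = ssb_weights(grid, knots, bandwidths, v)   # six columns: m = 6
```

The first column equals $w_1(s) V_1$, so it takes the value 0.9 at $s = 0.5$ and decays away from the knot. The last column is the leftover mass $p_m(s)$ used to judge the truncation.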

Understanding the spatial correlation function is crucial for analyzing
spatial data. Although the spatial stick-breaking prior forgoes the Gaussian
assumption for the spatial random effects, we can still compute and
investigate the covariance function. Conditional on the probabilities $p_j(s)$
(but not the locations $\theta_j$), the covariance between two observations is

(8)    $Cov(y(s), y(s')) = \tau^2 \sum_{j=1}^{m} p_j(s) p_j(s').$

Figure 2(b) maps the correlation function induced by the probabilities in 
Figure 2(a). For these probabilities, the correlation is not simply a function 
of distance between points, that is, the correlation is nonstationary. For 
example, the correlation is near one for all sites in (0.4, 0.6) due to the large 
probability for the first component throughout the region. In contrast, the 
correlation between nearby sites is smaller near s = 0.35 and s = 0.65 where 
several components have substantial probability. 
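
To make the conditional correlation concrete: given the masses, two distinct sites share the same $\theta_j$ with probability $\sum_j p_j(s) p_j(s')$, and this probability is the correlation between $\mu(s)$ and $\mu(s')$. The sketch below uses made-up masses for three sites. The function name and the numbers are assumptions for illustration only.

```python
import numpy as np

def conditional_correlation(p):
    """Correlation of mu(s) and mu(s') given the masses.

    p has shape (n_sites, m) with rows summing to one. For s != s' the
    correlation is sum_j p_j(s) p_j(s'), the probability that the two
    sites share a mixture component; the diagonal is set to one.
    """
    corr = p @ p.T
    np.fill_diagonal(corr, 1.0)
    return corr

# Three sites, three components (made-up masses for illustration).
p = np.array([[0.90, 0.05, 0.05],
              [0.85, 0.10, 0.05],
              [0.10, 0.10, 0.80]])
corr = conditional_correlation(p)
```

Sites dominated by the same component are strongly correlated, while sites dominated by different components are nearly uncorrelated regardless of how far apart they are. This is the nonstationarity described above.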

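As shown in Appendix A.2, integrating over $(V_i, \psi_i, \epsilon_i)$ and
letting $m \to \infty$ gives

(9)     $Var(y(s)) = \sigma^2 + \tau^2,$

(10)    $Cov(y(s), y(s')) = \tau^2 \gamma(s, s') \left[ 2\left(1 + \frac{b}{a + 1}\right) - \gamma(s, s') \right]^{-1},$

where

(11)    $\gamma(s, s') = \frac{\int\int w_i(s) w_i(s') p(\psi_i, \epsilon_i) \, d\psi_i \, d\epsilon_i}{\int\int w_i(s) p(\psi_i, \epsilon_i) \, d\psi_i \, d\epsilon_i}.$
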

Since $(V_i, \psi_i, \epsilon_i)$ have independent priors that are uniform over
the spatial domain, integrating over these parameters gives a stationary prior
covariance. However, Figure 2 illustrates that the conditional covariance can be
nonstationary. Therefore, we conjecture the spatial stick-breaking model is
more robust to nonstationarity than traditional stationary Kriging methods.

If $b/(a + 1)$ is large, that is, the $V_i$ are generally small and there are
many terms in the mixture with significant mass, the correlation between $y(s)$
and $y(s')$ is approximately proportional to $\gamma(s, s')$. Table 1 gives the
function $\gamma(s, s')$ for several examples of kernel functions and shows that
different kernels can produce very different correlation functions. For example,
$\gamma(s, s')$ under the uniform kernel with exponential priors for the
bandwidth parameters is the familiar exponential correlation function. If the
bandwidths are shared across kernel functions, $\gamma(s, s')$ is proportional
to a squared exponential covariance under squared exponential kernel functions.
The uniform kernel with common bandwidth parameter $\lambda$ gives compact
support, as observations separated by more than $\lambda$ spatial units are
uncorrelated.
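
The two uniform-kernel cases can be evaluated numerically as a sanity check. The sketch below assumes the forms $\gamma(s, s') = \prod_j (1 - |s_j - s_j'|/\lambda)^+$ for a common bandwidth and $\gamma(s, s') = \exp(-h_1/\lambda)$ for exponential bandwidths with mean $\lambda$. It combines them with the prior correlation of the spatial effects at distinct sites implied by the geometric-series argument of Appendix A.2, $\gamma/[2(1 + b/(a+1)) - \gamma]$. Function names and numerical inputs are hypothetical.

```python
import numpy as np

def gamma_uniform_common(d1, d2, lam):
    """Uniform kernels, common bandwidth lam: product of (1 - |d_j|/lam)^+."""
    return max(1 - abs(d1) / lam, 0.0) * max(1 - abs(d2) / lam, 0.0)

def gamma_uniform_expo(d1, d2, lam):
    """Uniform kernels, Expo(lam) bandwidths: exp(-h_1 / lam)."""
    return float(np.exp(-(abs(d1) + abs(d2)) / lam))

def prior_correlation(gamma, a, b):
    """Prior correlation of mu(s), mu(s') for s != s':
    gamma / [2 (1 + b / (a + 1)) - gamma]."""
    return gamma / (2.0 * (1.0 + b / (a + 1.0)) - gamma)

lam = 0.3                                   # illustrative range parameter
for d in (0.1, 0.2, 0.4):                   # separation along the first axis
    g_common = gamma_uniform_common(d, 0.0, lam)
    g_expo = gamma_uniform_expo(d, 0.0, lam)
    print(d, prior_correlation(g_common, 1.0, 2.0),
          prior_correlation(g_expo, 1.0, 2.0))
```

With a common bandwidth the correlation is exactly zero once the separation in either coordinate exceeds $\lambda$, whereas the exponential-bandwidth case decays smoothly and never reaches zero.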
4. A multivariate spatial model for wind fields data. Let $u(s)$ and $v(s)$
be the underlying wind speed in the west/east and north/south directions,
respectively, for spatial location $s$. As described in Section 2, there are two
types of observed wind data: $u_T(s)$ and $v_T(s)$ are satellite measurements
and $u_B(s)$ and $v_B(s)$ are buoy measurements. Our model for these data is

(12)    $u_T(s) = a_u + u(s) + e_{uT}(s)$,    $v_T(s) = a_v + v(s) + e_{vT}(s)$,
        $u_B(s) = u(s) + e_{uB}(s)$,    $v_B(s) = v(s) + e_{vB}(s)$,

where $\{e_{uT}, e_{vT}, e_{uB}, e_{vB}\}$ are independent (of each other and of
the underlying winds), zero mean, Gaussian errors, each with its own variance,
and $\{a_u, a_v\}$ account for additive bias in the satellite data. Of course, the
buoy data may also have bias, but it is impossible to identify bias from both
sources, so we attribute all the bias to the satellite measurements. It is also
possible to add multiplicative bias terms, but with the small number of buoy
observations it will be difficult to identify both types of bias, and Foley and
Fuentes (2006) found that the primary source of bias is additive.

The underlying orthogonal wind components $u(s)$ and $v(s)$ are modeled as
a mixture of a deterministic wind model and a semiparametric multivariate
spatial process

(13)    $u(s) = H_u(s) + R_u(s)$,
        $v(s) = H_v(s) + R_v(s)$,

where $H_u(s)$ and $H_v(s)$ are the orthogonal components of the deterministic
Holland model in (3) and $R(s) = (R_u(s), R_v(s))'$ follows a multivariate
extension of the non-Gaussian spatial stick-breaking prior of Section 3. We
take $R(s) \sim F(s)$, where $F$ has the stick-breaking prior in (5) modified so
the two-dimensional locations $\theta_i$ have multivariate normal priors
$\theta_i \sim N(0, \Sigma)$, where $\Sigma$ is a 2 x 2 covariance matrix that
controls the association between the two wind components. The covariance
$\Sigma$ has an InvWish(0.1, 0.1 $I_2$) prior and, after transforming the spatial
grid to be contained in the unit square, the spatial knots $\psi_{1i}$ and
$\psi_{2i}$ have independent Beta(1.5, 1.5) priors to encourage knots to lie near
the center of the hurricane where the wind is most volatile. Also, we take the
spatial range $\lambda \sim$ Uniform(0, 1).
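
A single prior draw of the residual field $R(s)$ can be simulated as a rough illustration of this construction. The sketch below makes several assumptions that are not part of the paper: uniform kernels with a common width `lam`, fixed values for `a`, `b`, `lam` and `sigma` (which have priors in the model), truncation at `m` components, and the hypothetical function name `draw_residual_field`.

```python
import numpy as np

def draw_residual_field(sites, m, a, b, lam, sigma, rng):
    """One prior draw of R(s) = (R_u(s), R_v(s)) at sites in the unit square.

    sites: array (n, 2). Uniform kernels with common width lam in each
    coordinate; the final component absorbs the leftover mass.
    """
    n = sites.shape[0]
    knots = rng.beta(1.5, 1.5, size=(m, 2))              # psi_i
    v = rng.beta(a, b, size=m)                           # V_i
    theta = rng.multivariate_normal(np.zeros(2), sigma, size=m)  # theta_i
    inside = np.abs(sites[:, None, :] - knots[None, :, :]) < lam / 2
    vs = inside.all(axis=2) * v                          # V_i(s) = w_i(s) V_i
    vs[:, -1] = 1.0                                      # truncation at m
    remaining = np.hstack([np.ones((n, 1)),
                           np.cumprod(1 - vs[:, :-1], axis=1)])
    p = vs * remaining                                   # p_i(s), rows sum to 1
    g = np.array([rng.choice(m, p=row) for row in p])    # component labels g(s)
    return theta[g], p

rng = np.random.default_rng(2)
xs = np.linspace(0.05, 0.95, 10)
sites = np.array([(x, y) for x in xs for y in xs])       # 10 x 10 grid
sigma = np.array([[1.0, -0.3], [-0.3, 1.0]])             # illustrative Sigma
r, p = draw_residual_field(sites, m=30, a=1.0, b=1.0, lam=0.3,
                           sigma=sigma, rng=rng)
```

Both wind components at a site come from the same $\theta_{g(s)}$, so the cross-dependence is governed by $\Sigma$ while the spatial dependence is governed by the kernels.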

Assuming the same priors for the $p_i(s)$ as in Section 3 and following
the same steps as in Appendix A.2, it can be shown that $Cov(R(s), R(s'))$
is separable, that is, the product of the spatial covariance and the
cross-dependency covariance matrix $\Sigma$. This could be generalized by
allowing the prior covariance of the $\theta_i$ to vary spatially.
Alternatively, the spatial stick-breaking prior could be combined with the
linear model of coregionalization to give a nonseparable multivariate spatial
model by modeling the u and v components of $R(s)$ as linear combinations of
univariate spatial terms given the spatial stick-breaking priors described in
Section 3.
5. Analysis of Hurricane Ivan's wind field. We fit three models to 182
satellite observations and 7 buoy observations for Hurricane Ivan. We
use the multivariate SSB model in Section 4 with both uniform and
squared-exponential kernels. To illustrate the effect of relaxing the normality
assumption, we also fit a fully Gaussian Bayesian Kriging model [Handcock
and Stein (1993)] that replaces the stick-breaking prior for
$R(s) = (R_u(s), R_v(s))'$ in (13) with a zero-mean Gaussian prior with separable
covariance

(14)    $Var(R(s)) = \Sigma$ and $Cov(R(s), R(s')) = \Sigma \times \exp(-\|s - s'\|/\lambda),$

where $\Sigma$ controls the dependency between the wind components at a given
location and $\lambda$ is a spatial range parameter. The covariance parameters
$\Sigma$ and $\lambda$ have the same priors as the covariance parameters in
Section 4.
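
For a finite set of sites, the separable covariance in (14) is a Kronecker product. The sketch below builds the joint covariance of the stacked vector $(R_u(s_1), \ldots, R_u(s_n), R_v(s_1), \ldots, R_v(s_n))$. The stacking order, the function name and the example values of `sites`, `sigma` and `lam` are assumptions for illustration.

```python
import numpy as np

def separable_covariance(sites, sigma, lam):
    """Covariance of (R_u(s_1..s_n), R_v(s_1..s_n)) under (14)."""
    diff = sites[:, None, :] - sites[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))      # ||s - s'||
    spatial = np.exp(-dist / lam)                # exponential correlation
    return np.kron(sigma, spatial)               # Sigma (x) spatial correlation

sites = np.array([[0.0, 0.0], [0.3, 0.4], [1.0, 1.0]])
sigma = np.array([[2.0, -0.5], [-0.5, 1.0]])     # illustrative Sigma
cov = separable_covariance(sites, sigma, lam=0.5)
```

Fitting this Gaussian model requires factorizing a matrix of dimension twice the number of sites, which is the large-matrix cost the stick-breaking formulation avoids.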

Since our primary objective is to predict wind vectors at unmeasured
locations to use as inputs for numerical ocean models, we compare models in
terms of expected mean squared prediction error [Laud and Ibrahim (1995)
and Gelfand and Ghosh (1998)], that is,

(15)    $EMSPE = E\left[ \sum_{s} \left\{ (\tilde{u}_T(s) - u_T(s))^2 + (\tilde{v}_T(s) - v_T(s))^2 \right\} \right],$

where, say, $\tilde{u}_T(s)$ is viewed as a replicate of the observed u component
of the satellite measurement at site $s$, the summation is taken over all
observation locations, and the expectation is taken over the full posterior of
all the parameters in the model. This model selection criterion favors predictive
models centered near the observed data with small predictive variances.

The EMSPE is smaller for the semiparametric model with uniform kernels
(EMSPE = 3.46) than for the semiparametric model with squared exponential
kernels (EMSPE = 4.19) and the fully Gaussian model (EMSPE = 5.17).
Figures 3(a) and 3(b) show that the squared residuals from the fully Gaussian
fit are near zero for most of the spatial domain but are large near the center
of the hurricane for both components. The Gaussian model oversmooths in
this area with high volatility in the underlying wind surface. In contrast, the
semiparametric model with uniform kernel functions is able to capture the
peaks near the eye of the hurricane and the squared residuals [Figures 3(c)
and 3(d)] show less spatial structure than the residuals from the Gaussian
model.

Fig. 3. Squared residuals (value - posterior mean) for the u and v components of the
Gaussian and spatial stick-breaking model with uniform kernels. The "*" represents the
storm's center.

Figure 4 summarizes the posterior from the spatial stick-breaking prior
with uniform kernel functions. The fitted values in Figures 4(a) and 4(b) vary
rapidly near the center of the storm and are fairly smooth in the periphery.
After accounting for the Holland model, the correlation between the residual
u and v components $R_u(s)$ and $R_v(s)$
[$\Sigma_{12}/\sqrt{\Sigma_{11}\Sigma_{22}}$, where $\Sigma_{kl}$ is the $(k, l)$
element of $\Sigma$] is generally negative [Figure 4(c)], confirming the need for a
multivariate analysis. Figure 4(d) plots the posterior of the parameter that
controls the size of bandwidths, $\lambda$. The posterior median of $\lambda$ is
0.17, so on average the uniform kernels span about 17% of the spatial domain.

The satellite data are significantly biased relative to the buoy data. The
95% posterior intervals for the bias terms $a_u$ and $a_v$ are (-6.91, -2.16) and
(0.04, 4.38), respectively. The biases seem to be driven by the third buoy's
wind vector in Figure 1(b), which is quite different from the nearby satellite
observations in Figure 1(a).

Fig. 4. Summary of the posterior of the spatial stick-breaking model with uniform kernels.
Panels (a) and (b) give the posterior mean surface for the u and v components, panel (c)
shows the posterior of the cross-correlation between the residual wind components $R_u(s)$
and $R_v(s)$ ($\Sigma_{12}/\sqrt{\Sigma_{11}\Sigma_{22}}$), and panel (d) plots the
posterior of the parameter that controls the average kernel bandwidth $\lambda$
assuming the spatial grid has been transformed to lie in
the unit square. The "*" represents the storm's center.

To show that the semiparametric model with uniform kernel functions
fits the data well, we randomly (across u and v components and buoy and
satellite data) set aside 10% of the observations and compute 95% predictive
intervals for the missing observations. The prediction intervals contain 94.7%
(18/19) of the deleted u components and 95.2% (20/21) of the deleted v
components. These statistics suggest that our model is well calibrated.
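
Both summaries used in this section, the EMSPE and the coverage of predictive intervals, can be estimated from posterior predictive draws. The sketch below assumes the draws are stored as an array with one row per MCMC iteration and one column per observation. The function names are hypothetical and the data are synthetic stand-ins, not the Hurricane Ivan measurements.

```python
import numpy as np

def emspe(pred_draws, observed):
    """Monte Carlo estimate of E[ sum_s (replicate(s) - observed(s))^2 ].

    pred_draws: array (n_draws, n_obs) of posterior predictive replicates.
    observed:   array (n_obs,) of the corresponding observations.
    Returns the estimate and its split into a fit term and a spread term.
    """
    total = ((pred_draws - observed) ** 2).sum(axis=1).mean()
    fit = ((pred_draws.mean(axis=0) - observed) ** 2).sum()   # squared error of the mean
    spread = pred_draws.var(axis=0).sum()                     # predictive variance
    return total, fit, spread

def interval_coverage(pred_draws, held_out, level=0.95):
    """Fraction of held-out values inside equal-tailed predictive intervals."""
    lo, hi = np.quantile(pred_draws, [(1 - level) / 2, (1 + level) / 2], axis=0)
    return ((held_out >= lo) & (held_out <= hi)).mean()

rng = np.random.default_rng(3)
observed = rng.normal(size=500)               # synthetic stand-in for wind data
pred_draws = rng.normal(size=(4000, 500))     # synthetic predictive replicates
total, fit, spread = emspe(pred_draws, observed)
coverage = interval_coverage(pred_draws, observed)
```

The identity `total = fit + spread` shows why the criterion rewards predictive distributions that are both centered near the data and tight. For the wind data, the u and v replicates would be concatenated column-wise before calling `emspe`.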

6. Discussion. Modeling hurricane wind fields is an important and
challenging problem. This paper presents a semiparametric multivariate spatial
model for these data. The semiparametric model avoids oversmoothing near
the center of Hurricane Ivan's wind field, resulting in a well-calibrated
predictive model. Gaussian models with highly structured covariance functions,
for example, Wikle et al. (2001) and Fuentes et al. (2005), are an alternative.
However, our non-parametric model offers greater flexibility by allowing for
nonstationarity and nonnormality, which is advantageous when building an
automated procedure.

In the statistical model for wind fields data, the spatial random effects
with the spatial stick-breaking prior are mixed with independent normal
errors. An extension of this model would be to replace the independent
normal effects with a Gaussian spatial process. This would give a mixture
of two spatial terms: a semiparametric term to handle discontinuities and
a Gaussian process which performs well in smooth areas. A mixture of this
nature has been considered by Lawson and Clark (2002), who propose a
fully parametric mixture of spatial models for disease mapping with areal
spatial data. As Lawson and Clark point out, it can be difficult to identify
the contribution of each component of the mixture, but using a combination
of spatial terms can lead to an improvement in fit.

This paper focused on estimating the wind field at a single time point
because satellite data are only available twice daily. However, the spatial
stick-breaking prior developed here could be extended to the spatiotemporal
setting to improve real-time estimates. One possibility is to use
three-dimensional kernel functions in space and time. An alternative
spatiotemporal model would be an extension of the dynamic linear model of
Gelfand, Banerjee and Gamerman (2005b), that is,
$R(s, t) = B R(s, t - 1) + A(s, t)$,
where $R(s, t)$ is the vector of residual wind components at location $s$ at time
$t$, $B$ is diagonal with $B_{ii} \in [-1, 1]$, and $A(s, t)$ is the vector of
changes from time $t - 1$ to time $t$. The spatial stick-breaking prior could be
applied to the mean at the first time point, $R(s, 1)$, and each $A(s, t)$.

APPENDIX A.1: PROPRIETY OF THE SSB PRIOR

For infinite $m$, Ishwaran and James (2001) show that
$\sum_{i=1}^{\infty} p_i(s) = 1$ almost surely if and only if
$\sum_{i=1}^{\infty} E(\log(1 - V_i(s))) = -\infty$. Applying Jensen's
inequality,

$E[\log(1 - V_i(s))] \le \log[E(1 - V_i(s))] = \log[1 - E\{w_i(s)\} E(V_i)].$

If both $E\{w_i(s)\}$ and $E(V_i)$ are positive, $\log(1 - E\{w_i(s)\} E(V_i))$
is negative and

$\sum_{i=1}^{\infty} E[\log(1 - V_i(s))] \le \sum_{i=1}^{\infty} \log(1 - E\{w_i(s)\} E(V_i)) = -\infty.$

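APPENDIX A.2: $Cov(\mu(s), \mu(s'))$

Due to the discrete nature of the stick-breaking prior,
$Cov(\mu(s), \mu(s')) = \tau^2 \mathrm{Prob}(\mu(s) = \mu(s'))$:

$\mathrm{Prob}[\mu(s) = \mu(s') \mid V_i, \psi_i, \epsilon_i] = \sum_{i=1}^{\infty} p_i(s) p_i(s')$

$= \sum_{i=1}^{\infty} \left[ w_i(s) w_i(s') V_i^2 \prod_{j<i} \left( 1 - (w_j(s) + w_j(s')) V_j + w_j(s) w_j(s') V_j^2 \right) \right].$

Integrating over the $(V_i, \psi_i, \epsilon_i)$ gives

$\mathrm{Prob}(\mu(s) = \mu(s')) = c_2 v_2 \sum_{i=1}^{\infty} [1 - 2 c_1 v_1 + c_2 v_2]^{i-1},$

where $c_1 = \int\int w_i(s) p(\psi_i, \epsilon_i) \, d\psi_i \, d\epsilon_i$,
$c_2 = \int\int w_i(s) w_i(s') p(\psi_i, \epsilon_i) \, d\psi_i \, d\epsilon_i$,
$v_1 = E(V_1) = a/(a + b)$, and $v_2 = E(V_1^2) = a(a + 1)/[(a + b)(a + b + 1)]$. Since
$1 - 2 c_1 v_1 + c_2 v_2 = E[(1 - p_1(s))(1 - p_1(s'))] \in [0, 1]$, we apply the
formula for the sum of a geometric series and simplify, leaving

$\mathrm{Prob}(\mu(s) = \mu(s')) = \frac{c_2 v_2}{2 c_1 v_1 - c_2 v_2} = \frac{\gamma(s, s')}{2\left(1 + \frac{b}{a + 1}\right) - \gamma(s, s')},$

where $\gamma(s, s') = c_2/c_1$.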